General Statistics
Overview
• Total Cells: 9,798
• Total Genes: 14,589
• Highly Variable Genes: 2,000
• Median Genes/Cell: 0
• Median UMIs/Cell: 0
• Analysis Steps: 7/7
📋 Dataset Summary: This single-cell RNA-seq dataset contains
9,798 cells and 14,589 genes.
After quality control and feature selection, 2,000 highly variable genes
(13.7% of total) were identified for downstream analysis.
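The 13.7% figure follows directly from the two counts reported in this summary. A minimal check, using only those numbers (the variable names are illustrative and not part of the pipeline):

```python
# Counts taken from the dataset summary in this report.
total_genes = 14_589
n_hvg = 2_000

# Share of genes retained as highly variable, as a percentage with one decimal.
hvg_fraction = n_hvg / total_genes
hvg_percent = round(hvg_fraction * 100, 1)

print(f"{n_hvg:,} of {total_genes:,} genes selected ({hvg_percent}%)")
```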
Gene Expression Analysis
Feature Selection
Gene Expression Overview and HVG Selection
✅ Feature Selection Results: 2,000 highly variable genes were selected
from 14,589 total genes (13.7%). These features will be used
for dimensionality reduction and downstream analyses.
Principal Component Analysis
Dimensionality Reduction
PCA Results and Variance Explained
🔧 PCA Parameters
• Number of components: 50
• Data layer: scaled
• Use highly variable genes: True
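The "variance explained" view for a PCA is usually derived from the per-component variances. The sketch below shows that arithmetic on made-up values; the eigenvalues are hypothetical and are not taken from this dataset, and the helper names are illustrative only.

```python
from itertools import accumulate


def explained_variance_ratio(eigenvalues):
    """Return each component's share of the total variance."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return [value / total for value in eigenvalues]


def n_components_for(ratios, threshold):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    for k, cumulative_share in enumerate(accumulate(ratios), start=1):
        if cumulative_share >= threshold:
            return k
    return len(ratios)


# Hypothetical per-component variances, sorted in decreasing order.
toy_eigenvalues = [8.0, 4.0, 2.0, 1.0, 1.0]

ratios = explained_variance_ratio(toy_eigenvalues)
cumulative = list(accumulate(ratios))

for index, (ratio, running) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{index}: {ratio:.4f} (cumulative {running:.4f})")
```

In a truncated PCA such as the 50-component run reported here, the denominator should be the total variance of the scaled data, not just the sum over the retained components; the toy example sidesteps this by keeping every component.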
Batch Effect Correction
Integration
Batch Correction Comparison: Before and After Integration
🔄 Integration Methods Applied: Two batch correction methods, Harmony and scVI, were run.
X_scVI was selected as the integration method; detailed benchmarking metrics
are not available for this run.
🔧 Integration Parameters
• Harmony PCs: 50
• scVI latent dimensions: 30
• scVI layers: 2
• Best method: X_scVI
Cell Clustering
7 Clusters
Final Clustering Results
🎯 Clustering Summary: Automated clustering identified
7 distinct cell clusters using the SCCAF algorithm
with Leiden clustering. Results are visualized using MDE (Minimum Distortion Embedding).
Cell Cycle Analysis
Phase Distribution
Cell Cycle Phase Distribution and Scores
| Cell Cycle Phase | Cell Count | Percentage | Status |
|---|---|---|---|
| G1 | 4,813 | 49.1% | ✅ Normal |
| S | 2,792 | 28.5% | ✅ Normal |
| G2M | 2,193 | 22.4% | ✅ Normal |
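The percentages in the table can be recomputed from the cell counts, and the counts can be checked against the 9,798 total cells reported in the overview. A minimal sketch using only numbers from this report (the variable names are illustrative):

```python
# Cell counts per phase, copied from the table above.
phase_counts = {"G1": 4813, "S": 2792, "G2M": 2193}
total_cells = 9798  # Total Cells from the overview

# Every cell should be assigned to exactly one phase.
counts_match_total = sum(phase_counts.values()) == total_cells

# Percentage of cells in each phase, rounded to one decimal as in the table.
phase_percent = {
    phase: round(100 * count / total_cells, 1)
    for phase, count in phase_counts.items()
}

for phase, percent in phase_percent.items():
    print(f"{phase}: {phase_counts[phase]:,} cells ({percent}%)")
print("Counts sum to total cells:", counts_match_total)
```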
Integration Method Benchmark
Auto-Selected
🏆 Best Method: X_scVI was automatically selected as the integration method. Detailed benchmarking metrics are not available.
🔧 Available Integration Methods
• Harmony: ✅ Available
• scVI: ✅ Available
• Selected: X_scVI
Analysis Pipeline Status
Workflow
| Analysis Step | Status | Parameters |
|---|---|---|
| 🔍 Quality Control & Filtering | ✅ Completed | mode: seurat; min_cells: 3; min_genes: 200 (+ 10 more) |
| ⚙️ Preprocessing & Normalization | ✅ Completed | mode: shiftlog\|pearson; target_sum: 500000.0; n_HVGs: 2000 (+ 1 more) |
| 📏 Data Scaling | ✅ Completed | Default parameters |
| 📈 Principal Component Analysis | ✅ Completed | layer: scaled; n_pcs: 50 |
| 🔄 Cell Cycle Scoring | ✅ Completed | s_genes: ['Cdca7', 'Mcm4', 'Mcm7', 'Rfc2', 'Ung', 'Mcm6', 'Rrm1', 'Slbp', 'Pcna', 'Atad2', 'Tipin', 'Mcm5', 'Uhrf1', 'Polr1b', 'Dtl', 'Prim1', 'Fen1', 'Hells', 'Gmnn', 'Pold3', 'Nasp', 'Chaf1b', 'Gins2', 'Pola1', 'Msh2', 'Casp8ap2', 'Cdc6', 'Ubr7', 'Ccne2', 'Wdr76', 'Tyms', 'Cdc45', 'Clspn', 'Rrm2', 'Dscc1', 'Rad51', 'Usp1', 'Exo1', 'Blm', 'Rad51ap1', 'Cenpu', 'E2f8', 'Mrpl36']; g2m_genes: ['Cbx5', 'Aurkb', 'Cks1b', 'Cks2', 'Jpt1', 'Hmgb2', 'Anp32e', 'Lbr', 'Tmpo', 'Top2a', 'Tacc3', 'Tubb4b', 'Ncapd2', 'Rangap1', 'Cdk1', 'Smc4', 'Kif20b', 'Cdca8', 'Ckap2', 'Ndc80', 'Dlgap5', 'Hjurp', 'Ckap5', 'Bub1', 'Ckap2l', 'Ect2', 'Kif11', 'Birc5', 'Cdca2', 'Nuf2', 'Cdca3', 'Nusap1', 'Ttk', 'Aurka', 'Mki67', 'Pimreg', 'Ccnb2', 'Tpx2', 'Hjurp', 'Anln', 'Kif2c', 'Cenpe', 'Gtse1', 'Kif23', 'Cdc20', 'Ube2c', 'Cenpf', 'Cenpa', 'Hmmr', 'Ctcf', 'Psrc1', 'Cdc25c', 'Nek2', 'Gas2l3', 'G2e3'] |
| 🎵 Harmony Integration | ✅ Completed | n_pcs: 50 |
| 🧬 scVI Integration | ✅ Completed | n_layers: 2; n_latent: 30; gene_likelihood: nb |
| 📊 Method Benchmarking | ❌ Not Completed | Default parameters |
| 🎯 SCCAF Clustering Analysis | ❌ Not Completed | Default parameters |
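The Parameters column in the table above uses a compact `key: value; key: value (+ N more)` notation. If those cells need to be read programmatically, a small parser along these lines may help. This is a sketch written against the strings shown in this report, not a function provided by OmicVerse; `parse_parameters` is a hypothetical helper, and all values are kept as strings.

```python
import re


def parse_parameters(cell):
    """Split a Parameters cell into (dict of shown parameters, count of hidden ones)."""
    hidden = 0
    # A trailing "(+ N more)" marks parameters that the report does not display.
    match = re.search(r"\(\+\s*(\d+)\s+more\)\s*$", cell)
    if match:
        hidden = int(match.group(1))
        cell = cell[:match.start()]

    params = {}
    for item in cell.split(";"):
        key, separator, value = item.strip().partition(":")
        if not separator:
            continue  # e.g. "Default parameters" carries no key/value pair
        params[key.strip()] = value.strip()
    return params, hidden


qc_params, qc_hidden = parse_parameters(
    "mode: seurat; min_cells: 3; min_genes: 200 (+ 10 more)"
)
prep_params, prep_hidden = parse_parameters(
    "mode: shiftlog|pearson; target_sum: 500000.0; n_HVGs: 2000 (+ 1 more)"
)
default_params, default_hidden = parse_parameters("Default parameters")

print(qc_params, qc_hidden)
print(prep_params, prep_hidden)
print(default_params, default_hidden)
```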
📋 Pipeline Summary: This analysis was run with the OmicVerse lazy function pipeline,
which automates quality control, normalization, batch correction, clustering, and benchmarking
for single-cell RNA-seq data.
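The normalization step listed in the pipeline status uses mode `shiftlog` with `target_sum: 500000.0`. Assuming `shiftlog` denotes the common "scale each cell to a fixed total, then apply log(1 + x)" transform, the per-cell arithmetic can be sketched as follows. The function name and the toy counts are hypothetical, and this is not the pipeline's own implementation.

```python
import math


def shiftlog_normalize(counts, target_sum=500_000.0):
    """Scale one cell's counts to sum to target_sum, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        # An empty cell has nothing to scale; return zeros unchanged.
        return [0.0 for _ in counts]
    scale = target_sum / total
    return [math.log1p(count * scale) for count in counts]


# Hypothetical UMI counts for a single cell across four genes.
toy_cell = [0, 5, 20, 75]
normalized = shiftlog_normalize(toy_cell)

for raw, value in zip(toy_cell, normalized):
    print(f"raw={raw:>3}  normalized={value:.4f}")
```

Undoing the log with `math.expm1` recovers scaled counts that sum to the target, which is a quick way to confirm that a matrix was normalized with the expected `target_sum`.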